Discriminatively trained phoneme confusion model for keyword spotting

Authors

  • Panagiota Karanasou
  • Lukás Burget
  • Dimitra Vergyri
  • Murat Akbacak
  • Arindam Mandal
Abstract

Keyword Spotting (KWS) aims at detecting speech segments that contain a given query within large amounts of audio data. Typically, a speech recognizer is involved in a first indexing step. One of the challenges of KWS is how to handle recognition errors and out-of-vocabulary (OOV) terms. This work proposes the use of discriminative training to construct a phoneme confusion model, which expands the phonemic index of a KWS system by adding phonemic variation to handle the abovementioned problems. The objective function that is optimized is the Figure of Merit (FOM), which is directly related to the KWS performance. The experiments conducted on English data sets show some improvement in the FOM and are promising for the use of such a technique.
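
To make the idea of adding phonemic variation concrete, the following is a minimal sketch of how confusion weights could generate scored variants of a phoneme sequence. It is not the authors' method: the abstract describes expanding the index and training the weights discriminatively against the FOM, whereas this sketch expands a single query sequence with fixed, hand-picked weights. The table `CONFUSION`, the function `expand_query`, and the parameters `max_subs` and `min_score` are all hypothetical names introduced for illustration.

```python
import math

# Hypothetical confusion log-weights: CONFUSION[(source, target)] = log-weight.
# The values are illustrative only; they are not taken from the paper.
CONFUSION = {
    ("ae", "eh"): math.log(0.10),
    ("t", "d"): math.log(0.15),
    ("k", "g"): math.log(0.05),
}


def expand_query(phonemes, confusion, max_subs=1, min_score=math.log(0.01)):
    """Return {variant_tuple: log_score} for a sequence and its variants.

    A variant is built by applying at most `max_subs` substitutions; its
    score is the sum of the log-weights of the substitutions applied
    (0.0 for the unmodified sequence). Variants scoring below `min_score`
    are pruned.
    """
    variants = {tuple(phonemes): 0.0}
    frontier = dict(variants)
    for _ in range(max_subs):
        next_frontier = {}
        for seq, score in frontier.items():
            for i, ph in enumerate(seq):
                for (src, dst), weight in confusion.items():
                    if src != ph:
                        continue
                    new_score = score + weight
                    if new_score < min_score:
                        continue  # prune unlikely variants
                    new_seq = seq[:i] + (dst,) + seq[i + 1:]
                    # Keep only the best score found for each variant.
                    if new_score > variants.get(new_seq, -math.inf):
                        variants[new_seq] = new_score
                        next_frontier[new_seq] = new_score
        frontier = next_frontier
    return variants


if __name__ == "__main__":
    for seq, score in sorted(expand_query(["k", "ae", "t"], CONFUSION).items(),
                             key=lambda item: -item[1]):
        print(" ".join(seq), round(score, 3))
```

In a discriminatively trained system, the fixed weights above would instead be parameters adjusted to improve a retrieval metric such as the FOM; the pruning threshold here only keeps the expansion from growing without bound.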

Similar resources

Keyword Spotting Based on Phoneme Confusion Matrix

For many practical applications of keyword spotting, the input signal is spontaneous conversation, while the acoustic model was trained on read speech because of data availability. Generally speaking, a keyword spotting system will degrade significantly because of the mismatch between the acoustic model and spontaneous speech. To solve this problem, this paper presents a two-pass keyword spotting strategy...

Keyword Spotting in A-capella Singing

Keyword spotting (or spoken term detection) is an interesting task in Music Information Retrieval that can be applied to a number of problems. Its purposes include topical search and improvements for genre classification. Keyword spotting is a well-researched task on pure speech, but state-of-the-art approaches cannot be easily transferred to singing because phoneme durations have much higher v...

Phoneme Based Acoustics Keyword Spotting in Informal Continuous Speech

This paper describes several approaches to keyword spotting (KWS) based on Gaussian mixture (GM) hidden Markov modelling (HMM). Context-independent and context-dependent phoneme models are used in our system. The system was trained and evaluated on informal continuous speech. We used different complexities of KWS recognition networks and different types of phoneme models. The impact of these parameters on ...

Bootstrapping a System for Phoneme Recognition and Keyword Spotting in Unaccompanied Singing

Speech recognition in singing is still a largely unsolved problem. Acoustic models trained on speech usually produce unsatisfactory results when used for phoneme recognition in singing. On the flipside, there is no phonetically annotated singing data set that could be used to train more accurate acoustic models for this task. In this paper, we attempt to solve this problem using the DAMP data s...

Phoneme-Lattice to Phoneme-Sequence Matching Algorithm Based on Dynamic Programming

A novel phoneme-lattice to phoneme-sequence matching algorithm based on dynamic programming is presented in this paper. Phoneme lattices have been shown to be a good choice to encode in a compact way alternative decoding hypotheses from a speech recognition system. These are typically used for the spoken term detection and keyword-spotting tasks, where a phoneme sequence query is matched to a r...

Publication date: 2012